An Efficient Map-Reduce Framework to Mine Periodic Frequent Patterns
Authors
Abstract
Periodic-frequent patterns (PFPs) are an important class of regularities that exist in a transactional database. In the literature, pattern-growth-based approaches to mining PFPs have been proposed for a single machine. In this paper, we propose a Map-Reduce framework to mine PFPs across multiple machines. We propose a parallel algorithm that includes a step for distributing transactional identifiers among the machines. Further, we introduce the notion of a partition summary to reduce the amount of data shuffled among the machines. Experiments on Apache Spark’s distributed environment show that the proposed approach speeds up as the number of machines increases, and that the partition summary significantly reduces the amount of data shuffled among the machines.
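The abstract does not spell out what a partition summary contains, so the following is only a minimal sketch of one plausible reading, not the paper's actual data structure. It assumes the usual periodic-frequent model (a pattern qualifies when its support is at least a minimum support and its largest gap between consecutive occurrences, including the gaps to the start and end of the database, is at most a maximum periodicity), and it assumes each machine holds a contiguous, non-overlapping range of transaction identifiers. All names here (`PartitionSummary`, `summarize`, `merge`, `min_sup`, `max_per`) are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


def max_periodicity(tids: List[int], db_size: int) -> int:
    """Largest gap between consecutive occurrences, counting the gap from
    the start of the database (tid 0) and to its end (tid db_size)."""
    points = [0] + sorted(tids) + [db_size]
    return max(b - a for a, b in zip(points, points[1:]))


def is_periodic_frequent(tids: List[int], db_size: int,
                         min_sup: int, max_per: int) -> bool:
    """Single-machine check over the full tid list of one pattern."""
    return len(tids) >= min_sup and max_periodicity(tids, db_size) <= max_per


@dataclass
class PartitionSummary:
    """Hypothetical fixed-size digest of one pattern's tids in one partition."""
    support: int        # number of occurrences in this partition
    first_tid: int      # smallest tid in this partition
    last_tid: int       # largest tid in this partition
    max_inner_gap: int  # largest gap between occurrences inside the partition


def summarize(tids: List[int]) -> Optional[PartitionSummary]:
    """Map side: collapse a local tid list into a constant-size summary."""
    if not tids:
        return None
    ts = sorted(tids)
    inner = max((b - a for a, b in zip(ts, ts[1:])), default=0)
    return PartitionSummary(len(ts), ts[0], ts[-1], inner)


def merge(summaries: List[PartitionSummary], db_size: int) -> Tuple[int, int]:
    """Reduce side: recover (support, max periodicity) from summaries alone.
    Assumes the partitions cover disjoint, contiguous tid ranges."""
    if not summaries:
        return 0, db_size
    ordered = sorted(summaries, key=lambda s: s.first_tid)
    support = sum(s.support for s in ordered)
    max_per = max(s.max_inner_gap for s in ordered)
    prev = 0  # start of the database
    for s in ordered:
        # Gap that spans the boundary between the previous partition and this one.
        max_per = max(max_per, s.first_tid - prev)
        prev = s.last_tid
    max_per = max(max_per, db_size - prev)  # gap to the end of the database
    return support, max_per


# Toy data: one pattern's tids in a 10-transaction database, split in two.
all_tids = [1, 3, 4, 7, 9, 10]
parts = [summarize([1, 3, 4]), summarize([7, 9, 10])]
merged = merge([p for p in parts if p is not None], db_size=10)
```

Under these assumptions, each machine sends four integers per pattern instead of a full tid list, which is the kind of shuffle reduction the abstract describes; whether the paper's summary holds exactly these fields is not stated in the text above.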
Similar resources
An Efficient Approach to Mine Periodic-Frequent Patterns in Transactional Databases
Recently, the temporal occurrences of frequent patterns in a transactional database have been exploited as an interestingness criterion to discover a class of user-interest-based frequent patterns, called periodic-frequent patterns. Informally, a frequent pattern is said to be periodic-frequent if it occurs at regular intervals specified by the user throughout the database. The basic model of pe...
Discovering Quasi-Periodic-Frequent Patterns in Transactional Databases
Periodic-frequent patterns are an important class of user-interest-based frequent patterns that exist in a transactional database. A frequent pattern is said to be periodic-frequent if it appears periodically throughout the database. We have observed that it is difficult to mine periodic-frequent patterns in very large databases. The reason is that the occurrence behavior of the patterns can vary...
Discovering Periodic-Frequent Patterns in Transactional Databases
Since mining frequent patterns from transactional databases involves an exponential search space and generates a huge number of patterns, efficient discovery of a user-interest-based frequent pattern set becomes the first priority for a mining algorithm. In many real-world scenarios it is often sufficient to mine a small, interesting, representative subset of frequent patterns. Temporal periodicity...
Performance Improvements and Efficient Approach for Mining Periodic Sequential Access Patterns
Surfing the Web has become an important daily activity for many users. Discovering and understanding web users’ surfing behavior are essential for the development of successful web monitoring and recommendation systems. To capture users’ web access behavior, one promising approach is web usage mining which discovers interesting and frequent user access patterns from web usage logs. Web usage mi...
An Efficient Pruning and Filtering Strategy to Mine Partial Periodic Patterns from a Sequence of Event Sets
Partial periodic patterns are commonly seen in real-world applications. The major problem of mining partial periodic patterns is the efficiency problem due to a huge set of partial periodic candidates. Although some efficient algorithms have been developed to tackle the problem, the performance of the algorithms significantly drops when the mining parameters are set low. In the past, the author...